BigBench Specification V0.1 - BigBench: An Industry Standard Benchmark for Big Data Analytics

Authors

  • Tilmann Rabl
  • Ahmad Ghazal
  • Minqing Hu
  • Alain Crolotte
  • Francois Raab
  • Meikel Pöss
  • Hans-Arno Jacobsen

Abstract

In this article, we present the specification of BigBench, an end-to-end big data benchmark proposal. BigBench models a retail product supplier. The benchmark proposal covers a data model and a set of big data specific queries. BigBench’s synthetic data generator addresses the variety, velocity and volume aspects of big data workloads. The structured part of the BigBench data model is adopted from the TPC-DS benchmark. In addition, the structured schema is enriched with semistructured and unstructured data components that are common in a retail product supplier environment. This specification contains the full query set as well as the data model.

Similar resources

Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data

Enterprises perceive a huge opportunity in mining information that can be found in big data. New storage systems and processing paradigms are allowing for ever larger data sets to be collected and analyzed. The high demand for data analytics and rapid development in technologies have led to a sizable ecosystem of big data processing systems. However, the lack of established, standardized benchma...

A BigBench Implementation in the Hadoop Ecosystem

BigBench is the first proposal for an end-to-end big data analytics benchmark. It features a rich query set with complex, realistic queries. BigBench was developed based on the decision support benchmark TPC-DS. The first proof-of-concept implementation was built for the Teradata Aster parallel database system, and the queries were formulated in the proprietary SQL-MR query language. To test oth...

Introducing TPCx-HS: The First Industry Standard for Benchmarking Big Data Systems

Discussion of BigBench: A Proposed Industry Standard Performance Benchmark for Big Data. Chaitanya Baru, Milind Bhandarkar, Carlo Curino, Manuel Danisch, Michael Frank, Bhaskar Gowda, Hans-Arno Jacobsen, Huang Jie, Dileep Kumar, Raghunath Nambiar, Meikel Poess, Francois Raab, Tilmann Rabl, Nishkam Ravi, Kai Sachs, Sapta...

Star Schema Benchmark (ssb)

Big Data Analytics Benchmark (BigBench). Tags: star schema benchmark, ssb, parallel data generation framework, pdgf, benchmarking, skew. relational models which have been for a few years the most used to support classical data warehousing applications such as Star Schema Benchmark (SSB). Star Schema Benchmark (6) is a recently proposed data warehousing benchmark that has been implement...

Characterizing BigBench Queries, Hive, and Spark in Multi-cloud Environments

BigBench is the new standard (TPCx-BB) for benchmarking and testing Big Data systems. The TPCx-BB specification describes several business use cases (queries) that require a broad combination of data extraction techniques, including SQL, Map/Reduce (M/R), user code (UDF), and Machine Learning. However, currently, there is no widespread knowledge of the different resource require...

Publication date: 2012